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The ability to engineer enzymes and other proteins to any 
desired stability would have wide-ranging applications. 
Here, we demonstrate that computational design of a 
library with chemically diverse stabilizing mutations 
allows the engineering of drastically stabilized and fully 
functional variants of the mesostable enzyme limonene 
epoxide hydrolase. First, point mutations were selected if 
they significantly improved the predicted free energy of 
protein folding. Disulfide bonds were designed using sam- 
pling of backbone conformational space, which tripled the 
number of experimentally stabilizing disulfide bridges. 
Next, orthogonal in silico screening steps were used to 
remove chemically unreasonable mutations and mutations 
that are predicted to increase protein flexibility. The result- 
ing library of 64 variants was experimentally screened, 
which revealed 21 (pairs of) stabilizing mutations located 
both in relatively rigid and in flexible areas of the enzyme. 
Finally, combining 10-12 of these confirmed mutations 
resulted in multi-site mutants with an increase in apparent 
melting temperature from 50 to 85°C, enhanced cataly- 
tic activity, preserved regioselectivity and a >250-fold 
longer half-life. The developed Framework for Rapid 
Enzyme Stabilization by Computational libraries (FRESCO) 
requires far less screening than conventional directed 
evolution. 

Keywords: enzyme stability/m silico design/i'n silico 
screening/protein stability engineering/thermostability 



Introduction 

Metabolic engineering and industrial biocatalysis increasingly 
need protein engineering of enzymes to provide the desired 
catalytic properties, such as regio- and stereospecificity, result- 
ing in high product yields and low losses to side products. For 
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applied biocatalysis, an ideal enzyme has a long shelf life and 
is stable under practical process conditions, which often 
includes high temperatures that are needed to solubilize sub- 
strates and prevent microbial contamination. Mutations that 
provide a gain of function often decrease stability, and more 
than a few of such mutations in a mesostable enzyme result in 
the loss of folding and expression (Bloom et al, 2006; 
Besenmatter et al, 2007; Tokuriki and Tawfik, 2009). To 
improve protein function by mutagenesis, thermostable start- 
ing points are preferred, but these are often not available from 
natural biodiversity. For these reasons, methods to improve the 
stability of enzymes and other proteins are highly relevant 
(Schmid et al, 2001; Eijsink et al., 2004; Bommarius et al, 
2006; Bornscheuer et al, 2012). 

Unless thermostability is associated with reversible unfold- 
ing, it is difficult to stabilize a protein by site-directed muta- 
genesis. For proteins that do unfold reversibly, there is an 
equilibrium between the folded and unfolded states and the 
effects of mutations on stability can be modeled relatively ac- 
curately. Computational design can produce highly stabilized 
variants of such model proteins (Malakauskas and Mayo, 
1998; Borgo and Havranek, 2012). However, for most proteins 
inactivation is essentially irreversible, often triggered by an 
initial unfolding of a particular region of the protein (Eijsink 
et al., 2005). Also, due to the kinetically complicated mechan- 
isms involved, the effects of mutations on stability are hard to 
predict (Eijsink et al., 2004; Polizzi et al, 2007). For typical 
proteins, these complications make it challenging to engineer 
major stability increases. 

Existing protein stabilization strategies normally yield only 
2-15°C increase in thermostability of enzymes (Williams 
et al, 1999; Vazquez-Figueroa et al, 2007; Wijma et al, 
2013), which is very modest when compared with the ranges 
of thermostability observed in natural enzymes. Directed evo- 
lution can be applied to improve the stability of enzymes by 
introducing (random) mutations in the coding gene and screen- 
ing large libraries (typically >10 4 variants) to find the rare 
mutations that improve thermostability (Giver et al, 1998; 
Bommarius et al, 2006; Reetz et al, 2006; Turner, 2009). A 
serious shortcoming of such a random approach is that it can 
only be applied to enzymes for which high-throughput expres- 
sion and activity screens are available. Other methods to sta- 
bilize enzymes are consensus design (Lehmann and Wyss, 
2001; Bommarius et al, 2006), rational protein engineering 
(Eijsink et al, 2004), the creation of chimeric enzymes 
(Romero et al, 2013) and computational design (Korkegian 
et al, 2005; Gribenko et al, 2009; Joo et al, 2011). The 
number of stabilizing mutations that are introduced is usually 
rather low and currently none of these methods work well 
enough to reliably achieve a large stability increase of a target 
enzyme. 

Here, we present a strategy aimed at dramatically improving 
the thermostability of an enzyme by computational design. 
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Scheme I. FRESCO. In Step 1, stabilizing mutations are generated with 
multiple algorithms. The in silico screening Steps 2 and 3 remove false 
positives. In Step 2, variants are eliminated which have properties that are 
known to typically decrease thermostability, such as increased hydrophobic 
surface exposure to the water phase or an increased number of unsatisfied 
H-bond donors and acceptors (for details, see the Materials section and the 
Results section). Step 3 eliminates variants in increased flexibility (an example 
is shown in Supplementary Fig. S3). An experimental screening (Step 4) is 
used before combining the most stabilizing mutations in Step 5. Details 
regarding Step 5 are described in the Results section. 



Our idea was that existing protein stabilization methods are 
mainly limited by their inability to find more than a few stabil- 
izing mutations. Thus, a more successful stabilization method 
should generate many stabilizing mutations in a short time. 
The developed stabilization procedure (Scheme 1) employs 
computational methods to predict a large number of independ- 
ent stabilizing mutations. Subsequent in silico screening steps 
eliminate chemically unreasonable mutations as well as muta- 
tions which increase protein flexibility. This reduces the 
number of variants that need to be screened in vitro. The 
selected mutations are tested experimentally, and the most sta- 
bilizing mutations are combined to obtain a highly thermo- 
stable enzyme variant. This strategy is referred to as 
Framework for Rapid Enzyme Stabilization by Computational 
libraries (FRESCO, Scheme 1). 

To explore this strategy, limonene epoxide hydrolase (LEH) 
from Rhodococcus erythropolis DCL14 was selected as it is a 
target for protein engineering, aimed at improving its applic- 
ability in the production of chiral building blocks (Zheng and 
Reetz, 2010). Further efforts to engineer the substrate specificity 
would benefit from the availability of a thermostable enzyme 
(Bloom et al, 2006; Besenmatter et al, 2007; Tokuriki and 



Tawfik, 2009), but the T$ p of wild-type (WT) LEH is only 
50°C. We show that, by applying the described FRESCO strat- 
egy, it is possible to produce extremely stabilized enzyme var- 
iants (Iffi +35°C) that are still fully functional. 



Methods 

Computational 

The relative changes in folding free energy AAG Fold due to 
point mutations and the 3D structures of the corresponding 
mutant enzymes were predicted with FoldX (foldx.crg.es) 
(Guerois et al, 2002) and with Rosettaddg (www. 
rosettacommons.org) (Kellogg et al, 2011) on the basis of the 
known LEH X-ray structure 1NWW (Arand et al, 2003). The 
predicted AAG Fo d equals the AG Fold for the protein carrying 
the point mutation minus the AG Fold for the WT protein. For 
FoldX, the standard settings of the software, which had been 
optimized on a large test set, were used, except that the calcu- 
lation was repeated five times to obtain a better averaging. For 
Rosettaddg, we used the algorithm described by Kellogg et al. 
(2011) which includes repacking within 8 A of the mutated 
residue using a soft-repulsion energy function (options -ddg:: 
local_opt_only true -ddg::opt_radius 8.0 -ddg::weight_file 
soft_rep_design -ddg: iterations 50 -ddg::min_cst false -ddg:: 
mean true -ddg::min false -ddg::sc_min_only false -ddg:: ram- 
p_repulsive false). To avoid mutations that are likely to inter- 
fere with catalysis or substrate binding, only residues that were 
> 10 A away from the active-site-bound heptamide ligand in 
1NWW (Arand et al, 2003) were allowed to mutate. 

Selection of potentially stabilizing mutations was based on 
the following two criteria. Any substitution would be selected 
if its predicted AAG Fold was <— 5kJmol _1 , which corre- 
sponds to the approximate error (3.3 kJ mol -1 in AAG Fold pre- 
dictions with FoldX (Guerois et al, 2002). For Rosettaddg, no 
error was reported (Kellogg et al, 2011), but since the correl- 
ation coefficients with experimental data were similar to those 
reported for FoldX, we assumed that Rosettaddg has a similar 
error. If the mutation had no significant effect, i.e. its predicted 
AAG Fold was in the range of —5 to +5 kJ mol -1 , then it was 
still selected if it belonged to one of the following types of 
mutations that are often observed to be stabilizing, i.e. 
XXX^Arg (Mrabet et al, 1992; Kumar et al, 2000; 
Sokalingam et al, 2012), XXX^Pro, Gly^XXX (Nosoh and 
Sekiguchi, 1991). However, none of these —5 to +5 kJ mol -1 
selected variants were experimentally stabilizing (see Fig. 1A 
and B). 

The newly written Dynamic Disulfide Discovery (DDD) al- 
gorithm uses for the design of disulfide bonds an ensemble of 
structures that are the snapshots from a molecular dynamics 
(MD) simulation. For all input structures, the algorithm itera- 
tively searches for residues that are within 7 A but more than 
15 positions away in the primary sequence. If such a neighbor- 
ing residue is found, the algorithm introduces multiple initial 
geometries of disulfide bonds with dihedrals 6\ for both donor 
and acceptor cysteine of —60°, 60°, and 180° (thus nine differ- 
ent combinations). These starting structures are energy mini- 
mized with fixed backbone atoms, and the resulting structure 
is analyzed for molecular mechanics energy of the sulfur 
atoms (to eliminate unnaturally strained disulfide bonds) and 
geometry (to eliminate disulfide bonds with uncommon 
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Fig. I . Experimentally characterized point mutants of LEH. Protein variants 
that are significantly more thermostable are labeled (e.g. T85V). The 
abbreviation NE stands for no soluble expression. The gray background is 
used to distinguish the mutations with a predicted AAG Fod that does not 
significantly differ from zero ( — 5 to +5 kJ/mol). The variants that would not 
survive Steps 2 or 3 are plotted with different symbols as indicated in the inset. 
(A) The point mutations that were predicted to be stabilizing using Rosettaddg 
and also survived Steps 2 and 3 in FRESCO (Scheme 1). (B) Idem for the 
point mutations predicted to be stabilizing using FoldX; (C) Library 
characterized for a control experiment, in which the effects of omitting Steps 2 
and 3 of the FRESCO protocol (Scheme 1) were tested. The best FoldX 
variants were selected, with maximally one mutation per position (thus, only 



T85I with a AAG 
mol). 



of -20kJ/mol, not T85V with AAG of -14kJ/ 



geometries, criteria described below). If a disulfide bond 
passes all these criteria, it is selected. 

The geometric criteria for disulfide bonds (Supplementary 
Fig. SI, Table SI) were selected by us based on the geometries 
observed for a large set of disulfide bonds in the protein data 
bank (Petersen et al, 1999; Pellequer and Chen, 2006). An 
energy criterion for the maximal molecular mechanics energy 



of the disulfide bond (lOkJmol -1 ) was adopted, which in a 
test set appeared to identify most of the existing disulfide 
bonds. For the developed algorithm, 12 out of the 14 disulfide 
bonds in a small test set consisting of X-ray structures 1CC5, 
1CPO, 1CRN, 1HNF, 1HXN, 1QBA, 1RLR and 2LBP were 
acceptable. While adopting more lenient criteria would allow 
acceptance of all the existing disulfide bonds in the dataset, 
such relaxed criteria are also expected to result in a higher 
fraction of false positives. 

MD simulations were carried out to predict the backbone 
flexibility of WT and designed variants after designs with 
clearly unreasonable features, expected to destabilize the 
protein (Nosoh and Sekiguchi, 1991), were filtered out by 
visual inspection. The encountered unreasonable features are 
quantified in the Results section. Simulations were carried out 
under Yasara with the Yamber3 force field, which is an Amber 
ff99 (Wang et al., 2000) derivative that has been specifically 
parameterized for increased structural accuracy (Krieger et al, 
2004). A rectangular simulation box was used (with periodic 
boundary conditions, extended 7.5 A around the protein fully 
solvated in explicit water with sodium chloride counter ions 
added to a concentration of 0.5%). Long-range (>7.86A) 
electrostatic interactions were modeled with a particle mesh 
Ewald algorithm (Essmann et at, 1995) with fourth degree 
B-spline functions. To remove clashes and conformational 
strain, an energy minimization was carried out before each 
MD simulation. This energy minimization was continued until 
the total energy decreased by < 0.05 kJ mol -1 , which was 
tested every 400 fs. The time step during the simulations was 
1.25 fs with the electrostatics and Lennard- Jones interac- 
tions updated once every two time steps. All MD simula- 
tions started with an energy-minimized structure, which was 
heated from 5 to 298 K in 30 ps. MD simulations were 
started with the original crystal water present. A Berendsen 
thermostat was used to control the temperature (Berendsen 
et al, 1984) under an NPT ensemble (number of particles, 
pressure, and temperature are constant). The time constant Tof 
the temperature coupling was set to 0. 1 ps. To improve repro- 
duction of the canonical ensemble, modifications to the 
Berendsen thermostat described elsewhere (Krieger et al, 
2004) were used. 

To analyze a large number of variants for structural flexibil- 
ity, five MD simulations with different initial atom velocities 
(Caves et al, 1998) of 100 (for the individual mutants) or 
1000 ps (for the combined mutants) were carried out. The pre- 
dicted flexibility of the enzyme by MD simulation depends on 
the initial velocities assigned at the start of the MD simulation; 
if different velocities are assigned initially, a different trajec- 
tory is observed (Caves et al, 1998). This provides a better 
sampling of conformations than a single long MD simulation, 
even if sub-trajectories are only 100 ps long (Caves et al, 
1998). The root mean square fluctuation (RMSF) obtained 
from 5 of such 100 ps MD simulations correlated well with 
those from the X-ray structures [Supplementary Fig. S2A and 
B, the RMSF of the crystal structures were calculated from 
their B-factors, with the standard equation RMSF = *J(3 x 
B-factor/8Tr 2 )]. However, the changes in flexibility due to 
mutations were difficult to detect from the RMSF 
(Supplementary Fig. S2C and D), while they could more 
easily be obtained from structural inspection of the simulated 
protein (Supplementary Fig. S3). All the structural flexibility 
effects were analyzed by inspecting the effect of the mutations 
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on the averaged structures obtained from the five different tra- 
jectories per variant (Supplementary Fig. S3). If one out of the 
five MD simulations appeared to sample a very different part 
of the protein conformational space than the other four, it was 
ignored (Supplementary Fig. S4). Removing these outliers 
enabled to compare different variants (Supplementary Fig. S3) 
because otherwise a protein variant that had such an outlier 
appeared to be much more flexible. For example, initially such 
an outlier in the simulation of the WT protein (Supplementary 
Fig. S4) made it appear during the structural inspection as if 
almost all the mutants were significantly more rigid than the 
WT. 

Experimental 

A detailed account of all experimental methods is provided in 
the Supplementary data. A plasmid containing the gene for the 
LEH was kindly provided by Prof. Dr. M. Arand (University 
of Zurich). Mutants of LEH were created by QuikChange mu- 
tagenesis (Agilent, USA) in a 96-well plate. Most enzyme var- 
iants, including those with multiple disulfide bonds, were 
expressed in Escherichia coli TOP10 cells (Life Technologies, 
CA, USA). Only variants with single disulfide bonds were 
expressed in E.coli NEB Shuffle Express (New England 
Biolabs, USA). The mutations were validated by DNA se- 
quencing. Purification was carried out on Ni-NTA columns 
with a C-terminal hexa-histidine tag using cell lysate prepared 
from cells grown in 1 1 of Terrific Broth. If needed, further 
purification was carried out by gel filtration (Supradex 75, Life 
technologies, UK). This procedure yielded around 50 mg/1 
protein for variants without disulfide bonds, and 5-10 mg 
protein per liter culture volume for disulfide variants. Proteins 
containing disulfide bonds are usually less well expressed in 
the cytoplasm of E.coli (Marco de, 2009). 

To determine the 7^ PP of variants during screening, the ther- 
mofluor method was carried out essentially as reported else- 
where (Ericsson et al., 2006). The analyzed samples consisted 
of either purified protein or cell-free extract for screening of 
point mutations. For measurements, 5 |xl 100 x diluted com- 
mercial Sypro Orange solution (Life Technologies, CA, USA) 
was added to a 20 |xl protein sample. The apparent melting 
temperature (75yj' p ) was determined by heating the samples 
from 25 to 90°C at l°C/min in a MyiQ real-time PCR machine 
(Bio-Rad, Hercules, CA, USA) while recording the fluores- 
cence with a 490 nm excitation filter and a 575 nm emission 
filter (Ericsson et al, 2006). The maximum of the relative 
fluorescence change with respect to the temperature (dRFU/ 
AT) was taken as the apparent melting temperature (7m p )- 

The presence of inter-subunit disulfide bonds was analyzed 
by examining the migration patterns of the WT and mutant 
proteins by sodium dodecyl sulfate -polyacrylamide gel elec- 
trophoresis (SDS-PAGE), both under reducing and non- 
reducing conditions, since reduction of disulfide bonds should 
cause a shift in migration behavior. To determine both inter- 
and intramolecular disulfide bonds, the number of free 
cysteines, which are not engaged in disulfide bonds, was deter- 
mined using Ellman's reagent [5,5'-dithio-bis-(2-nitrobenzoic 
acid)] (Ellman, 1959). 

Catalytic activities were measured in 4.5 ml potassium 
phosphate (50 mM, pH 7.1), and the enzyme was pre- 
incubated at 30°C for 5 min before adding (4/?)-limonene 
1,2-epoxide (mixture of (IR, 2S,4R) and (IS,2R,4R) isomers) 
as the substrate. After different incubation times, the reaction 



mixtures were extracted with ethyl acetate, centrifuged and the 
organic layers were removed and dried by Na 2 S04. The pro- 
duction of diasteromers was analyzed by chiral GC, using 
a Hydrodex (3-TBDAc column (Aurora Borealis, The 
Netherlands), with a temperature program from 40 to 100°C at 
10°C/min, 100-150°C at 5°C/min and finally 150-180°C at 
l°C/min. The retention times of (l/?,2/?,4/?)-limonene diol and 
(lS,2S,4R)-limonene diol under these conditions were 27.5 
and 29.6 min, respectively. A reference sample with both 
these diols was prepared (Wang et al, 2008). The nearly 
exclusive production of the (15',25',4/?)-limonene diol was 
confirmed by ! H-NMR (200 MHz), using the same peak 
assignments as reported earlier (Blair et al, 2007). 

To measure residual catalytic activity versus temperature, 
enzyme samples (l.Omgml -1 in 50 mM potassium phos- 
phate, pH 7.1) were incubated for 15 min at the desired tempera- 
tures using a peqSTAR gradient polymerase chain reaction 
heating block (Peqlab Biotechnologie GmbH, Erlangen, 
Germany). Samples were allowed to cool down to 4°C for 
1 min, and subsequently their catalytic activity was analyzed. 



Results 

Computational design of enriched libraries 

The first step in the FRESCO strategy for engineering protein 
stability consists of the computational design of potentially 
stabilizing point mutations and disulfide bonds (Scheme 1). 
Point mutations were selected by computational tools that 
predict the resulting change in AG of folding (AAG Fold ). The 
AAG Fold values were calculated with both Rosettaddg and 
FoldX since the underlying algorithms gave significantly dif- 
ferent predictions (Supplementary Fig. S5), resulting in differ- 
ent selected mutations. All residues were allowed to mutate, 
except those inside or near the active site. Of all 1634 evalu- 
ated point mutations, 248 were selected either because they 
were predicted to decrease the AAG Fold <— 5kJmol _1 or 
because they introduced a known type of stabilizing point mu- 
tation, such as those introducing a proline (see the Methods 
section for criteria). Of these 248 point mutations, 48% was 
predicted to be stabilizing only by Rosettaddg, 26% only by 
FoldX and 25% by both algorithms. 

Disulfide bonds were designed employing the newly 
written DDD algorithm, which uses MD simulations to sample 
backbone conformational space. Without this sampling of 
backbone conformational space (i.e. using only the X-ray 
structure), seven disulfide bonds were predicted to be stabiliz- 
ing (Supplementary Table SII). The sampling of different 
backbone positions by the MD simulation (Supplementary 
Fig. S6) resulted in the design algorithm recognizing an 
additional 21 possible disulfide bonds, providing a total of 
28 potentially stabilizing disulfide bonds (Supplementary 
Table SII). 

In the second step (Scheme 1), 130 of the point mutants 
(52%) were eliminated because structural inspection revealed 
features that are typically encountered with destabilizing muta- 
tions (Nosoh and Sekiguchi, 1991). A control experiment 
described below confirmed that this step enriches for stabiliz- 
ing mutations. Furthermore, such visual inspection to filter out 
the unreasonable variants is commonly employed in computa- 
tional protein design (Kiss et al, 2013; Wijma and Janssen, 
2013). Here, the main reasons to eliminate point mutants were 
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that a hydrophobic side chain became surface exposed (70%) 
or that an unsatisfied H-bond donor or acceptor was created 
(20%). Furthermore, all 16 point mutations (12%) of Pro23 
were eliminated because it appeared that the calculations erro- 
neously predicted Pro23 to be unfavorable for folding. Other 
reasons to eliminate variants were because a proline was intro- 
duced inside an ct-helix (4%) or because a hydrophobic 
protein cavity was created (2%). The sum is >100% because 
for 8% of the eliminated point mutants multiple elimination 
criteria applied. Of the disulfide bonds, five (18%) were elimi- 
nated because they created a large hydrophobic cavity. This 
left 118 point mutations (36% from FoldX, 32% from 
Rosettaddg, 31% from both) and 23 disulfide bonds. 

MD simulations on the surviving point mutations and disul- 
fide bond variants (Scheme 1, third step) were used to select 
against variants with increased local flexibility relative to the 
WT, because regions of increased flexibility in a protein are 
more prone to (partial) unfolding leading to inactivation 
(Vihinen, 1987). A control experiment described below con- 
firmed that this step eliminated destabilizing mutations. From 
these MD simulations, it appeared that some of the designed 
variants had a significantly lower local flexibility than the WT 
(Supplementary Fig. S7), which suggests increased stability. 
With the MD simulations, 54% of the 118 variants were elimi- 
nated, which reduced the number of variants that were pre- 
dicted to be stabilized to 64. This included 17 disulfide bonds 
and 47 point mutants of which 21 originated from FoldX, 12 
from Rosettaddg and 14 point mutations, which were predicted 
to be stabilizing by both Rosettaddg and FoldX. 

When these 64 variants were experimentally screened 
(Scheme 1, fourth step; Supplementary Fig. S8, Table SIII), 
21 variants had an improved 7^ p (33%). Of the 17 tested di- 
sulfide bond variants, 10 had an increased T$ p , ranging from 
+4 to +15°C. Seven out of the 10 stabilizing disulfide bonds 
originated from the additional backbone sampling by MD 
simulation (Table I, Fig. 2). Of the 47 point mutations, 11 
were stabilizing (6 from FoldX, 3 from Rosettaddg, 2 from 
both, Fig. 1A and B). Point mutations with a 



Table I. Experimentally confirmed stabilizing disulfide bonds designed using 
crystal structures or conformations from an MD simulation 



AAG 



Fold 



> 



- 5 kJ/mol had also been tested but none of these 



15 were experimentally stabilizing (Fig. 1A and B). Of the 
point mutations with a AAG Fold < - 5 kJ/mol, 34% was sta- 
bilizing. The catalytic activity of the thermostabilized variants 
was preserved, and most activities differ less than a factor 2 
from the catalytic activity of the WT LEH (Supplementary 
Table SIV). 

The discovery of 21 stabilized variants in a library of only 
64 variants is based on the use of the orthogonal in silico 
screening steps (Steps 2 and 3 in Scheme 1) for eliminating 
false-positive predictions. When using FoldX calculations 
only, a large fraction of the mutations predicted to be stabiliz- 
ing appeared to be destabilizing or neutral when tested experi- 
mentally. Selection of the mutations with the best predicted 
AAG Fold using Rosettaddg would only have resulted in the 
introduction of highly surface-exposed aromatic groups, 
which is a known problem in computational design (see the 
Discussion section). The best 18 variants at different positions 
according to FoldX included 6 variants that did pass through 
Stages 2 and 3, and thus also were included in the FRESCO 
library. When the discarded 12 variants were experimentally 
characterized, none were stabilizing (Fig. 1C, Supplementary 
Table SV) and half were strongly destabilizing. Of the 12 
false-positive predictions, 9 were eliminated at Step 2 of 
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FRESCO. Of these nine variants, four were eliminated 
because a highly surface-exposed hydrophobic group was 
introduced (Q7M, E68L, A48F, SHIM), three were elimi- 
nated because unsatisfied H-bond donors and acceptors were 
created by the mutations (S12M, T22D, G129S) and two 
were eliminated because a proline was introduced inside an 
a-helix (A40P, A41P). The other three false positives were 
eliminated at Step 3 of FRESCO because they were predicted 
to have increased flexibility (E49P, Y96W, R9P). The import- 
ance of eliminating false-positive predictions through Steps 2 
and 3 of the screening is also apparent from protein expres- 
sion levels, where lack of soluble expression of a mutant sug- 
gests lack of stability. Whereas in the FRESCO library, only 
2 variants (3%) were not expressed in soluble form, the 
absence of soluble expression was observed for 4 of the 12 
discarded variants. Two of these four variants that could not 
be solubly expressed belonged to the three variants that had 
been eliminated by MD flexibility screening. These results 
confirm that a framework type of computational approach, in 
which orthogonal screening steps serve to eliminate false- 
positive predictions, improves the accuracy with which sta- 
bilizing mutations are predicted and allows the discovery of 
multiple stabilizing mutations with minimal experimental 
screening. 
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Origins of stabilization 

The modeled 3D structures of the improved variants were ana- 
lyzed to investigate the structural basis for the stabilizing 
effects. Because of the large number of stabilizing mutations, 
the effects of individual mutations are described in the 
Supplementary data. The beneficial mutations appear to intro- 
duce better H-bonds that stabilize the local protein structure 
(A19K, N92K), improved surface charge-charge interactions 
(A19K, E45K, T76K, N92K, N92R), improved hydrophobic 
interactions (T85I, T85V, T85L, Y96F, Fig. 3) and entropic 
stabilization (S15P and all disulfide variants). All of the stabil- 
izing mutations are located at or near the surface of LEH. 
Furthermore, the most successful mutations appear to be pre- 
dominantly localized inside or near the flexible N-terminus 
(residues 3-24) and to a much lesser extent in Helices 3 and 4 
(Fig. 4). Many of the mutations are also near to the dimer 
interface, with the exception of the highly stabilizing 
S3C-V102C, K4C-A82C and A5C-E84C disulfide bonds that 
are located at the N-terminus but not at the dimer interface. 
The unsuccessful mutations are more evenly spread over the 
protein than the stabilizing mutations. These observations in- 
dicate that the N-terminal loop is the most critical region for 



stability of LEH, and that the successful mutations especially 
introduced stabilizing interactions in and around this region. 

Design of combined variants 

Aiming to obtain highly thermostable variants, the most stabil- 
izing mutations (Fig. 2) were rationally combined without 
construction of intermediate variants. If at a single position 
multiple stabilizing mutations were available, then the most 
stabilizing variant was used; for example, N92K with a A7m p 
of +7°C was preferred over N92R with a AT$ P of +2°C. 
Since the stability of LEH can be increased by improving the 
local stability of the N-terminal region (see above), different 
combinations of mutations were screened by MD simulation 
for their effect on the flexibility of the N-terminus. Disulfide 
bonds were included and MD simulations were again used to 
test their compatibility. To test the usefulness of the flexibility 
predictions by MD simulations (third step of Scheme 1) for 
variants in which multiple mutations are combined, two 
combinations of disulfide bonds were characterized, of which 
one (S3C/I5C/E84C/V102C) was predicted to rigidify the 
N-terminus and thus be stabilizing, while the other combin- 
ation (A40C/I44C/E68C/A72C) was predicted to increase the 
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Fig. 3. Example of the predicted structure for a stabilizing mutation. The substitution T85V (AT^ = +7°C) removes a hydroxyl group in an apolar environment, 
and replaces it with a more hydrophobic methyl group. The polar side-chain atoms of Thr97 and Arg99 are >5 A from the hydroxyl oxygen of Thr85, which 
excludes hydrogen bonding of Thr85 with those residues, indicating that Thr85 has an unsatisfied H-bond donor or acceptor. 
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N-terminal loop 

Fig. 4. Distribution of stabilizing mutations over the protein (crystal structure 1NWW). (A) B-factors of the C,, atoms of 1NWW (thickest traces with red color 
correspond to the highest B-factors). (B) Location of all the point mutations shown with spheres for which the color reveals the level of stabilization. 
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flexibility of the N-terminal region of the enzyme, and thus be 
destabilizing. Indeed, S3C/I5C/E84C/V102C had a higher 
7^ p than its parents (66.8°C, A7^ P =+15.8°C versus 
+ 13.5°C for I5C/E84C and +11.0°C for S3C/V102C), while 
A40C/I44C/E68C/A72C had a lower 7$ p than its parents 
(54.8°C, A7$[ p = 3.8 versus +5.0°C for A40C/A72C and 
+5.5°C for 44C/E68C). These experimental results are in 
agreement with the predictions from MD simulation, and indi- 
cate that MD simulations can increase the chance of success- 
fully combining mutations. 

The application of this method to combine multiple muta- 
tions simultaneously resulted in the final variants Fl, with 12 
mutations, and F2 with 10 mutations (Fig. 2). These two 
variants combine the strongest stabilizing mutations that were 
predicted by MD to rigidify the N-terminus, whereas combina- 
tions that enhance local flexibility were discarded. For 
example, S15P was omitted from variant F2 because in com- 
bination with the other mutations, an increased flexibility of 
the N-terminal loop was predicted by MD simulation. 
Furthermore, the highly stabilizing mutation N92K was 
omitted from variant F2 because it cannot be combined with 
the A17C/N92C disulfide bonds. Also, maximally two disul- 
fide bridges per enzyme were combined to avoid potential 
problems with the kinetics of protein folding. 

Catalytic properties of combined variants 

When tested experimentally, variants Fl and F2 both exhibited 
a dramatically increased F$ p (A7^ p =+34.6 and 35.5°C, 
Fig. 5A). A variant P which lacked the disulfide bonds of Fl 
still had a +20°C higher TZ P than the WT (Fig. 2). The appar- 
ent melting temperatures of the purified WT enzyme and its 
variants Fl and F2, as measured by differential scanning calor- 
imetry (DSC), were in agreement with results obtained by the 



thermofluor method (Fig. 5A and B). Thermal inactivation 
assays also demonstrated that the variants Fl and F2 are inacti- 
vated only above 80°C (Fig. 5C). Titration with Ellman's 
reagent, the comparison of apparent melting temperature and 
the electrophoretic mobility on SDS-PAGE gel under oxidiz- 
ing and reducing conditions (Supplementary data) confirmed 
that the disulfide bonds were formed in Fl and F2. 
Furthermore, the temperature optimum for catalytic activity of 
the mutants Fl and F2 was significantly increased compared 
with the WT. The WT enzyme had an optimum temperature of 
50°C, whereas the mutants Fl and F2 were most active at 80 
and 70°C, respectively (Fig. 5D). The irreversible inactivation 
of WT, Fl and F2 at 55°C was followed over time (Fig. 5E). 
The fitted rates were 0.31 ± 0.03, <80 x 10" 6 and (130 ± 
40) x 10 _6 min _1 , respectively. This demonstrates that var- 
iants Fl and F2 are inactivated at least 250 times slower than 
the WT. All these results confirm that Fl and F2 are highly 
thermostable variants of LEH. 

Despite the 10-12 introduced mutations, the catalytic activ- 
ities of variants Fl and F2 were retained. Both variants had a 
slightly reduced k C3t at 30°C (Table II, Fig. 6), but they were 
more than twice as catalytically active as the WT at their re- 
spective optimum temperatures (Figs. 5D and 6). Moreover, 
the stereoconvergent selectivity of the WT enzyme, which pro- 
duces enantiopure (\S,2S,AR)-\\monene. diol from a diasterio- 
meric mixture of (IR,2S,4R)- and (15',27?,47?)-limonene 
epoxide substrates (Van der Werf et al, 1999), was preserved 
in variants Fl and F2 (Fig. 5F, Supplementary Fig. S9). 



Discussion 

The results show that the computational design of as many sta- 
bilizing mutations as feasible in combination with in silico 
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Fig. 5. Overview of the improved thermal stability of the LEH mutants Fl and F2. The melting temperatures of Fl (blue) and F2 (red) compared with WT (black) 
as determined by DSC (A) and thermofluor assay (B). (C) The remaining catalytic activity measured at 30°C after pre-incubation for 15 min at the indicated 
temperatures. (D) The 30°C increase in optimum temperature for catalytic activity of these variants and (E) the slower enzyme inactivation by incubation at 55°C. 
(F) Retained regioselectivity of the final variants as determined by chiral GC.The elution profiles are those of the produced limonene diols. A reference sample 
contained both (lS,2fi,4R)-limonene diol at 27.5 min and (lS,2S,4R)-limonene diol at 29.6 min (shown in gray). 
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Table II. Catalytic parameters of WT LEH and variants Fl and F2 



Variant WT Fl F2 



Temperature (°C) 30 50 30 80 30 70 

^(s' 1 ) 13.9 + 0.8 63 +4 8.9 + 0.4 135 + 6 8.2 + 0.3 160 + 7 

K M (mM) 0.3 + 0.1 0.6 + 0.1 <0.25 a 0.6 + 0.1 <0.25 a 0.3 + 0.1 

W^mO^'M -1 ) 4.6 x 10 4 1.0 x 10 s >3.6 x 10 4 2.1 x 10 5 >3.3 x 10 4 4.9 x 10 s 

a K M was below the detection limit of 0.25 mM. The kinetic parameters for the hydrolysis of (4R)-limonene 1 ,2-epoxide were determined both at 30°C and at the 
optimum temperature of WT and variants Fl and F2. 




Substrate (mM) Substrate (mM) 

Fig. 6. Rate of (4/f)-limonene 1,2-epoxide conversion versus its concentration. Wild-type LEH (black circles), variant Fl (blue triangles) and variant F2 (red 
squares) are indicated with different symbols. The fit is according to k t = & cat x [5]/([S] + ^ M ), in which k t is the catalytic turnover rate per enzyme active site and 
[S] is the substrate concentration. The turnover rate is plotted at (A) 30' C and (B) at the optimum temperature for catalytic activity (for WT 50°C; for variant Fl 
80 "C; for variant F2 70°C). At 30 D C, the mutants have a lower catalytic activity compared with the WT, whereas the mutants clearly outperform the WT at the 
optimum temperature. 



and experimental screening allowed for rapid engineering of 
enzyme variants with a dramatically increased thermostability. 
Essential features of the FRESCO strategy proposed here are 
the use of computational design methods to create a library of 
potentially stabilizing mutations, followed by a reduction of 
library size through orthogonal in silico screening aimed at re- 
moving false-positive predictions. An experimental screening 
of the resulting library is then used to select the most stabiliz- 
ing variants for combination. The best combined variants 
exhibited an increase in 7m p of ~35°C as shown by DSC, 
thermofluor assays and activity -temperature measurements, 
while catalytic activity and stereoselectivity were maintained. 
Natural thermostable enzymes are often far less catalytically 
active at lower temperatures than their mesostable homologs 
(Fitter et al, 2001; Cheung et al, 2005). The results here show 
that even a 35°C increase in T$ p is not necessarily accompan- 
ied by loss of catalytic activity at lower temperatures. 

The remarkable stabilization of LEH was achieved by ex- 
perimental testing of < 100 variants in just two rounds of mu- 
tagenesis. This is a very low number when compared with 
directed evolution, where often >10 4 variants need to be 
screened to obtain a strong stabilization, since the vast major- 
ity of the tested mutations are neutral or detrimental for stabil- 
ity (Bloom and Arnold, 2009). In most cases, the screening 
step is the bottleneck of a directed evolution project (Turner, 
2009) and limits its application. Often no rapid expression 
systems or stability assays are available. High-throughput 
screening can be unfeasible if slow-growing organisms are 
required for protein expression or if assays cannot be scaled 
down. Therefore, the protocols presented here will be an at- 
tractive alternative for many proteins. 



The experimental results revealed that a critical region for 
stabilization of LEH is located in the vicinity of the flexible 
N-terminus (Fig. 4B). This includes both the interface of the 
dimer and the N-terminus itself which is partly more remote 
from the interface. For example, both the G89C/S91C disul- 
fide bond at the interface and the A5C/E84C disulfide bond at 
the N-terminus are highly stabilizing. The residues of the N- 
terminal S3C-V102C disulfide bond are located > 12 A away 
from the dimer interface, suggesting that its increased thermo- 
stability can be unrelated to improved stability of this dimer 
interface. 

Recent strategies for protein stabilization often only select 
the most flexible residues for mutagenesis (Reetz et al, 2006; 
Jochens et al, 2010; Joo et al, 2011), with the rationale that 
these should be the most critical residues. However, some of 
the highly stabilizing mutations found here are in a rigid part 
of the protein. For example, T85V (A7^ p +7°C) is close to 
the flexible N-terminus, even though the mutated residue is in 
a rigid part of the protein as judged by its B-factors (13 A , in 
X-ray structure 1NWW), which are lower than average (15 A 2 
is the average B-factor in 1NWW) and much lower than those 
of the flexible N-terminus (He 4, 27 A"). Thus, the computa- 
tional methods generated stabilizing mutations that would 
have been missed if only highly flexible residues had been 
selected for mutagenesis. 

An essential element of the FRESCO strategy is the elimin- 
ation of mutations that are suggested by the computational pro- 
tocols, but lack credibility when their predicted flexibility is 
taken into account or when their predicted structure is exam- 
ined. Structural inspection showed that about 50% of the 
mutations predicted to be stabilizing by the initial calculations 
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(Step 1) are probably false positives. They were discarded in 
Step 2 (Scheme 1) because they introduce structural features 
that are expected to destabilize the protein, such as water- 
exposed hydrophobic side chains. The latter is a known 
problem of computational design algorithms (Jacak et al, 
2012), and this justifies the use of rational criteria to remove 
false positives that result from imperfect energy functions and 
sampling in the design algorithms (Kellogg et al, 2011; 
Leaver-Fay et al, 2013). It is common in computational 
design to eliminate variants that have clear structural problems 
(Kiss et al, 2013; Wijma and Janssen, 2013). In the future, this 
could be automated as has been done for finding errors in 
X-ray structures (Vriend, 1990), or may no longer be necessary 
if the energy calculations and sampling methods are further 
improved. Since local unfolding followed by irreversible ag- 
gregation may be as important for enzyme inactivation as 
overall thermodynamic stability (Polizzi et al, 2006; Reetz 
et al, 2006; Joo et al, 2011), elimination of false positives 
was also based on MD simulations which predicted effects on 
local flexibility (Step 3). It is well established that high flexi- 
bility can promote unfolding (Vihinen, 1987). Here, experi- 
mental characterization of mutants that were eliminated at the 
third step of FRESCO because of higher flexibility showed 
that the discarded mutations were not stabilizing and often 
were even strongly destabilizing (Fig. 1C). 

The efficiency of FRESCO as a strategy is confirmed by the 
large number of mechanistically different stabilizing muta- 
tions that were discovered. The point mutations appear to act 
through various effects that can stabilize a protein, including 
the removal of unsatisfied H-bond donors/acceptors, introduc- 
tion of new H-bonds, better charge distribution (Karshikoff 
and Ladenstein, 2001; Gribenko et al, 2009), less hydrophobic 
exposure to solvent and entropic stabilization (Nosoh and 
Sekiguchi, 1991; Eij sink et al, 2004). Multiple disulfide bonds 
per protein, like in variants Fl and F2, occur naturally in the 
proteomes of a few thermophiles (Ladenstein and Ren, 2008). 
A complete list of the mutations and their proposed effects is 
given in Supplementary data. The ability to obtain mechanis- 
tically diverse types of stabilizing mutations is likely to 
become essential if the goal is to engineer strongly enhanced 
stability into any target protein. 

The developed computational strategy to stabilize an 
enzyme is reminiscent of directed evolution, in that a library 
of potentially stabilizing mutations is experimentally screened 
before combining the most successful mutations to final var- 
iants. A more common approach in computational design of 
thermostability is to select the best set of mutations purely in 
silico and only characterize the final combined variants 
(Malakauskas and Mayo, 1998; Korkegian et al, 2005; 
Gribenko et al. , 2009 ; Diaz et al. , 20 1 1 ; Joo et al. , 20 1 1 ; Borgo 
and Havranek, 2012; Miklos et ai, 2012; Murphy et ai, 2012). 
The results in Fig. IB show that such an approach would have 
missed highly stabilizing mutations (T85V/N92K). Another 
approach is to use the consensus approach in combination with 
computational design. Using FoldX for the computations, 
such an approach resulted in a cellobiohydrolase with a 9°C 
improved 7^ p (Komor et al, 2012). However, in a similar 
study it was reported that FoldX did not correlate well with 
thermostabilizing mutations (Polizzi et al. , 2006), which is in 
agreement with our results of finding false positives in the 
absence of orthogonal screening (Fig. 1C). To allow for larger 
increases in thermostability, the FRESCO approach uses an 



experimental screening to verify that the mutations indeed sta- 
bilize the enzyme and spare catalytic activity before creating 
variants in which mutations are combined. 

The modeling of backbone flexibility is an important 
problem in computational protein design. With a rigid back- 
bone, many beneficial mutations will be sterically excluded. 
The unusually large number of stabilizing disulfide bonds dis- 
covered in this study is mainly due to the use of an MD simu- 
lation that samples the natural backbone flexibility to generate 
different realistic starting structures for the design of disulfide 
bonds. Backbone sampling protocols normally do not incorp- 
orate explicit water molecules (Su and Mayo, 1997; Georgiev 
et al, 2008; Smith and Kortemme, 2008; Havranek and Baker, 
2009; Babor et al, 2011; Chitsaz and Mayo, 2013). The MD 
simulations include the surrounding water hydrogen-bonding 
network, which should make the sampling of energetically ac- 
cessible conformations more accurate. This protocol produced 
7 out of the 10 successful disulfide bonds, which included all 
three disulfide bonds that were combined in the final highly 
thermostable variants (Table I, Fig. 2). We are not aware of 
previous reports describing a similar large number of stabiliz- 
ing disulfide bonds. With existing methods to stabilize 
enzymes, typically one or two stabilizing disulfide bonds are 
reported (Matsamura et al, 1989; Pikkemaat et al, 2002; 
Dombkowski, 2003; Chen et al, 2009). Such numbers are 
similar to the finding of three stabilizing disulfide bonds for 
LEH (Table I) in the absence of backbone conformational 
sampling. The significant increase in the number of stabilizing 
disulfide bonds due to the conformational sampling experi- 
mentally shows that MD can generate structures that are accur- 
ate enough for computational protein design. 

The backbone sampling allowed to predict stabilizing disul- 
fide bonds at positions, where based on the X-ray structure a 
disulfide bond would not be feasible because the backbone 
atoms were too far away from each other. For example, in case 
of disulfide bonds distances of 3.6-7.2 A occur between their 
respective C a atoms (Petersen et al, 1999) in natural proteins, 
whereas the distance between the C a atoms of residues 4 and 
82 (where a stabilizing disulfide bond could be formed, 
Table I, Fig. 2) is at least 8.90 in the available X-ray structures 
(1NWW, 1NU3). During the MD simulation, the distance 
between the C a atoms of residues 4 and 82 decreased to 
6.52 A (results not shown). Without backbone conformational 
sampling, the additional disulfide bonds obtained from MD 
simulation could only have been discovered if the geometric 
criteria would have been relaxed. However, in that case the 
algorithms would also have proposed disulfide bonds that are 
expected to destabilize the protein because their geometries 
are far outside the naturally occurring ranges. 

The multi-site mutants, which harbored 12 (variant Fl) or 
10 (variant F2) substitutions, are fully catalytically active, 
with an increase in k cat at the optimum temperature when com- 
pared with WT. The only precaution adopted in the FRESCO 
protocol was not to introduce mutations at residues close to the 
active site. Regioselectivity of water attack on the diastereo- 
meric substrate is fully retained, allowing enantioconvergent 
production of (15',2/?,4/?)-limonene diol. The resulting var- 
iants are suitable for use in protein engineering aimed at intro- 
ducing new selectivities. 

In conclusion, we show that computational library design 
can identify many mutations with different stabilization 
mechanisms to cumulatively obtain a large increase in enzyme 
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thermostability. The computational library design enabled a 
larger jump in enzyme stability while preserving catalytic ac- 
tivity. The developed FRESCO strategy made it feasible to 
obtain protein variants with high thermostability in a short 
time with minimal experimental screening. 

Supplementary data 

Supplementary data are available at PEDS online. 
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